{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 转置卷积\n",
    "\n",
    "到目前为止，我们所见到的卷积神经网络层，例如卷积层和汇聚层，通常会减少下采样输入图像的空间维度（高和宽）。然而如果输入和输出的空间维度相同，在以像素级分类的语义分割将会很方便，例如，输出像素所处的通道维可以保有输入像素在同一的分类结果。\n",
    "\n",
    "为了实现这一点，尤其是空间维度被卷积神经网络缩小后，我们可以使用另一种类型的卷积神经网络层，它可以增加上采样中间特征图的空间维度。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 40,
   "metadata": {},
   "outputs": [],
   "source": [
    "import torch\n",
    "from torch import nn\n",
    "import d2lzh_pytorch as d2l"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": 41,
   "metadata": {},
   "outputs": [],
   "source": [
    "def trans_conv(X,K):\n",
    "    h,w=K.shape\n",
    "    Y=torch.zeros((X.shape[0] + h - 1, X.shape[1] + w - 1))\n",
    "    for i in range(X.shape[0]):\n",
    "        for j in range(X.shape[1]):\n",
    "            Y[i : i+h,j : j+w]+=X[i,j]*K\n",
    "    return Y"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "与通过卷积核 “减少” 输入元素的常规卷积相比，转置卷积通过卷积核“广播”输入元素，从而产生大于输入的输出。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 42,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "tensor([[ 0.,  0.,  1.],\n",
       "        [ 0.,  4.,  6.],\n",
       "        [ 4., 12.,  9.]])"
      ]
     },
     "execution_count": 42,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "X = torch.tensor([[0.0,1.0], [2.0,3.0]])\n",
    "K = torch.tensor([[0.0,1.0], [2.0,3.0]])\n",
    "trans_conv(X,K)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "或者，当输入X和卷积核K都是四维张量时，我们可以使用高级API获得相同的结果。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 43,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "tensor([[[[ 0.,  0.,  1.],\n",
       "          [ 0.,  4.,  6.],\n",
       "          [ 4., 12.,  9.]]]], grad_fn=<SlowConvTranspose2DBackward0>)"
      ]
     },
     "execution_count": 43,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "X, K = X.reshape(1, 1, 2, 2), K.reshape(1, 1, 2, 2)\n",
    "tconv = nn.ConvTranspose2d(1, 1, kernel_size=2, bias=False)\n",
    "tconv.weight.data = K\n",
    "tconv(X)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 填充、步幅和多通道\n",
    "\n",
    "与常规卷积不同，在转置卷积，填充被应用于的输出（常规卷积将填充应用于输入）"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 44,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "tensor([[[[0., 2.],\n",
       "          [2., 0.]]]], grad_fn=<SlowConvTranspose2DBackward0>)"
      ]
     },
     "execution_count": 44,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "tconv = nn.ConvTranspose2d(1, 1, kernel_size=2, stride=2,padding=1, bias=False)\n",
    "tconv.weight.data = K\n",
    "tconv(X)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 45,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "True"
      ]
     },
     "execution_count": 45,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "X = torch.rand(size=(1, 10, 16, 16))\n",
    "conv = nn.Conv2d(10, 20, kernel_size=5, padding=2, stride=3)\n",
    "tconv = nn.ConvTranspose2d(20, 10, kernel_size=5, padding=2, stride=3)\n",
    "tconv(conv(X)).shape == X.shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 46,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "tensor([[27., 37.],\n",
       "        [57., 67.]])"
      ]
     },
     "execution_count": 46,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "X = torch.arange(9.0).reshape(3, 3)\n",
    "K = torch.tensor([[1.0, 2.0], [3.0, 4.0]])\n",
    "Y = d2l.corr2d(X, K)\n",
    "Y"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 47,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "tensor([[0., 1., 2.],\n",
       "        [3., 4., 5.],\n",
       "        [6., 7., 8.]])"
      ]
     },
     "execution_count": 47,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "X"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "pytorch_gpu",
   "language": "python",
   "name": "pytorch_gpu"
  },
  "orig_nbformat": 4
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
